33 research outputs found

    Um Editor de Metadados para Documentar Padrões de Análise em uma Infraestrutura de Reuso

    Software development often faces obstacles to reusing analysis patterns because these computational artifacts are hard to access. The lack of a tool that eases the documentation of analysis patterns, and of a digital repository to store them, hinders their retrieval and reuse. This work presents the DC2AP Metadata Editor, a metadata editor for analysis patterns based on the Dublin Core Application Profile for Analysis Patterns (DC2AP) model. To organize the documentation process for analysis patterns and facilitate their retrieval, the DC2AP Metadata Editor provides documented analysis patterns as Linked Data, allowing the knowledge stored in these artifacts to be shared and automatically interpreted by software

    Expert Panel Curation of 113 Primary Mitochondrial Disease Genes for the Leigh Syndrome Spectrum

    OBJECTIVE: Primary mitochondrial diseases (PMDs) are heterogeneous disorders caused by inherited mitochondrial dysfunction. Classically defined neuropathologically as subacute necrotizing encephalomyelopathy, Leigh syndrome spectrum (LSS) is the most frequent manifestation of PMD in children, but may also present in adults. A major challenge for accurate diagnosis of LSS in the genomic medicine era is establishing gene-disease relationships (GDRs) for this syndrome with >100 monogenic causes across both nuclear and mitochondrial genomes. METHODS: The Clinical Genome Resource (ClinGen) Mitochondrial Disease Gene Curation Expert Panel (GCEP), comprising 40 international PMD experts, met monthly for 4 years to review GDRs for LSS. The GCEP standardized gene curation for LSS by refining the phenotypic definition, modifying the ClinGen Gene-Disease Clinical Validity Curation Framework to improve interpretation for LSS, and establishing a scoring rubric for LSS. RESULTS: The GDR with LSS across the nuclear and mitochondrial genomes was classified as definitive for 31/114 gene-disease relationships curated (27%); moderate for 38 (33%); limited for 43 (38%); and 2 as disputed (2%). Ninety genes were associated with autosomal recessive inheritance, 16 were maternally inherited, 5 autosomal dominant, and 3 X-linked. INTERPRETATION: GDRs for LSS were established for genes across both nuclear and mitochondrial genomes. Establishing these GDRs will allow accurate variant interpretation, expedite genetic diagnosis of LSS, and facilitate precision medicine, multi-system organ surveillance, recurrence risk counselling, reproductive choice, natural history studies and eligibility for interventional clinical trials

    Pervasive gaps in Amazonian ecological research

    Biodiversity loss is one of the main challenges of our time,1,2 and attempts to address it require a clear understanding of how ecological communities respond to environmental change across time and space.3,4 While the increasing availability of global databases on ecological communities has advanced our knowledge of biodiversity sensitivity to environmental changes,5–7 vast areas of the tropics remain understudied.8–11 In the American tropics, Amazonia stands out as the world’s most diverse rainforest and the primary source of Neotropical biodiversity,12 but it remains among the least known forests in America and is often underrepresented in biodiversity databases.13–15 To worsen this situation, human-induced modifications16,17 may eliminate pieces of the Amazon’s biodiversity puzzle before we can use them to understand how ecological communities are responding. To increase generalization and applicability of biodiversity knowledge,18,19 it is thus crucial to reduce biases in ecological research, particularly in regions projected to face the most pronounced environmental changes. We integrate ecological community metadata of 7,694 sampling sites for multiple organism groups in a machine learning model framework to map the research probability across the Brazilian Amazonia, while identifying the region’s vulnerability to environmental change. 15%–18% of the most neglected areas in ecological research are expected to experience severe climate or land use changes by 2050. This means that unless we take immediate action, we will not be able to establish their current status, much less monitor how it is changing and what is being lost

    Scalable and fast top-k most similar trajectories search using MapReduce in-memory

    Top-k most similar trajectories search (k-NN) is frequently used as a classification algorithm and in recommendation systems over spatial-temporal trajectory databases. However, k-NN trajectory search is a complex operation, and a multi-user application should be able to process multiple k-NN trajectory searches concurrently and efficiently over large-scale data. The k-NN trajectories problem has received plenty of attention; however, state-of-the-art works either do not consider in-memory parallel processing of k-NN trajectories or concurrent queries in distributed environments, or they parallelize k-NN search for simpler spatial objects (i.e. 2D points) using MapReduce but ignore the temporal dimension of spatial-temporal trajectories. In this work we propose a distributed parallel approach for k-NN trajectories search in a multi-user environment using MapReduce in-memory. We propose a space/time data partitioning based on Voronoi diagrams and time pages, named Voronoi Pages, in order to provide both spatial-temporal data organization and process decentralization. In addition, we propose a spatial-temporal index for our partitions to efficiently prune the search space and improve system throughput and scalability. We implemented our solution on top of Spark’s RDD data structure, which provides a thread-safe environment for concurrent MapReduce tasks in main memory. We perform extensive experiments to demonstrate the performance and scalability of our approach

    A framework for parallel map-matching at scale using Spark

    Map-matching is the problem of matching recorded GPS trajectories to a digital representation of the road network. GPS data may be inaccurate and heterogeneous, due to limitations or errors in electronic sensors, as well as legal restrictions. Accurately matching trajectories to the road map is an important preprocessing step for many real-world applications, such as trajectory data mining, traffic analysis, and route prediction. However, the high availability of GPS trajectories and map data challenges the scalability of current map-matching algorithms, which are limited to small datasets since they focus only on the accuracy of the matching rather than scalability. Therefore, we propose a distributed parallel framework for efficient and scalable offline map-matching on top of the Spark framework. Spark uses distributed in-memory data storage and the MapReduce paradigm to achieve horizontal scaling and fast computation of large datasets. Spark, however, is still limited for dynamic map-matching, and memory consumption in Spark can be an issue for very large datasets. We develop a framework to allow map-matching on top of Spark, while achieving horizontal scalability and efficient memory usage, and maintaining the accuracy of state-of-the-art matching algorithms, as follows: (1) We combine a sampling-based Quadtree spatial partitioning construction and batch-based computation to achieve horizontal scalability of map-matching, as well as reduce cluster memory usage. (2) We employ a safe spatial-boundary approach to preserve matching accuracy of boundary objects. (3) We provide a cost function for the distributed map-matching workload in order to tune the framework parameters. Our extensive experiments demonstrate that our framework is efficient and scalable for map-matching on large-scale data, while keeping matching accuracy high and memory usage low